Ultrasound radiomics based on axillary lymph nodes images for predicting lymph node metastasis in breast cancer

Objectives To determine whether ultrasound radiomics based on ALN imaging can be used to identify axillary lymph node (ALN) metastases in breast cancer. Methods A total of 147 breast cancer patients with 41 non-metastatic lymph nodes and 109 metastatic lymph nodes were divided into a training set (105 ALN) and a validation set (45 ALN). Radiomics features were extracted from ultrasound images and a radiomics signature (RS) was built. Intraclass correlation coefficients (ICCs), Spearman correlation analysis, and the least absolute shrinkage and selection operator (LASSO) method were used to select the ALN status-related features. All images were assessed by two radiologists with at least 10 years of experience in ALN ultrasound examination. The performance of the model and of the radiologists in the training and validation subgroups was then evaluated and compared. Results The radiomics signature accurately predicted the ALN status, achieving an area under the receiver operating characteristic curve (AUC) of 0.929 (95% CI, 0.881-0.978) in the training cohort and 0.919 (95% CI, 0.841-0.997) in the validation cohort. The radiomics model performed better than the two experts' prediction of ALN status in both cohorts (P<0.05). In addition, prediction in subgroups based on baseline clinicopathological information also achieved good discrimination, with AUCs of 0.937, 0.918, 0.885, 0.930, and 0.913 in the HR+/HER2-, HER2+, triple-negative, tumor size ≤ 3 cm, and tumor size > 3 cm subgroups, respectively. Conclusion The radiomics model demonstrated a good ability to predict ALN status in patients with breast cancer, which might provide essential information for decision-making.


Introduction
Breast cancer (BC) is the most commonly diagnosed cancer among women and accounts for 12.5% of all new annual cancer cases worldwide (1,2). Between 30.2% and 69.8% of BC patients have lymph node metastases (3). Determining the regional lymph node status is necessary to achieve precision therapy and a good prognosis (4). Some studies use clinical and pathological features of breast tumors, such as molecular subtype and maximum lesion diameter, to predict axillary lymph node tumor burden (5). However, there is no consensus on predicting lymph node metastasis from clinicopathological characteristics.
Currently, the main approach to determining lymph node status is sentinel node biopsy (6). However, this is an invasive method, with potential complications of arm pain, hematoma, seroma, lymphedema, and infection. The ACOSOG Z0011 clinical trial showed that, among patients with breast cancer and limited sentinel lymph node metastasis, sentinel lymph node dissection (SLND) alone did not lead to inferior survival compared with ALN dissection (7). However, the false-negative rate of SLND ranges from 7.8% to 27.3%, which cannot be ignored. False negatives are a common problem in patients with risk factors such as upper outer breast cancer and can lead to adverse consequences, including incorrect tumor staging and an increased risk of recurrence (8-10). At present, there is no highly accurate, non-invasive method for identifying ALN metastases in breast cancer.
Preoperative noninvasive ALN assessment methods include axillary ultrasonography (US), magnetic resonance imaging, and mammography. Axillary US can evaluate nodal morphology in real time and guide fine-needle biopsies. Asian women have higher-density breasts than other ethnic groups (11). In addition, female patients in Asian countries are mainly concentrated in a younger age group (12). Thus, US has become an effective method for diagnosing breast neoplasms and ALN lesions (13). It can aid preoperative evaluation of ALN status and help identify patients with an extremely low probability of non-sentinel lymph node (non-SLN) metastasis, for whom ALN dissection can be omitted (14). However, ultrasound has several shortcomings, such as its high dependency on the radiologist. Unnecessary biopsies may result when images are evaluated by inexperienced radiologists, and the diagnostic performance of axillary US in determining ALN status has been reported to be poor (15). Therefore, quantitative and non-invasive methods are still needed to predict ALN metastases of breast cancer (16).
With its rapid development, artificial intelligence (AI) has been widely used for processing large sets of medical images, including image reconstruction, image segmentation, analysis, and model prediction, leading to a boom in radiomics. Radiomics is the application of bioinformatics methods to extract multiple quantitative imaging features from medical images, which can provide additional information to predict potential tumor biological behavior. Many studies have shown that a radiomics approach performs well in improving the accuracy of malignant lesion discrimination and in facilitating the classification of tumor types and grades (17). However, among studies using ultrasound radiomics approaches, few analyses were based on ALN images or aimed to clarify whether radiomics is capable of classifying enlarged axillary lymph nodes.
The aim of our study was to devise a model able to predict the ALN metastatic status based on radiomics features extracted from ultrasound images of ALN in patients with breast cancer.

Patients and clinicopathological information
This retrospective study was approved by the ethics committee of the Yueyang Central Hospital, and informed consent was waived. Patients with breast cancer in two hospitals were evaluated: 259 and 40 ALN ultrasound images were obtained from Hunan Cancer Hospital (Hospital 1) and Yue Yang Central Hospital (Hospital 2), respectively. These images were produced by an ultrasound instrument from Super Sonic Imagine. The flowchart of the study population is shown in Figure 1.
The inclusion criteria were as follows: (i) patients with qualified images; (ii) patients with complete pathologic information; (iii) patients with complete baseline characteristics. The exclusion criteria were as follows: (i) the image was blurred or had been artificially marked; (ii) the patient had received chemotherapy before the ultrasound examination; (iii) baseline characteristics were incomplete.
The baseline clinicopathological information was derived from the patient medical record and included age, tumor size, pathological findings, and immunohistochemical (IHC) results for estrogen receptor (ER), progesterone receptor (PR), and HER2 status. For IHC characteristics, staining in ≥1% of cells was considered ER/PR positive, and staining in <1% of cells was considered ER/PR negative (18). We defined HER2 as positive if the IHC result was 3+ or the FISH result was positive; otherwise, the HER2 status was considered negative (19).
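The receptor rules above can be written down as a small classifier. The following is only a minimal sketch: the function names are hypothetical, treating a tumor as hormone receptor (HR) positive when either ER or PR is positive is an assumption, and assigning every HER2-positive tumor to the HER2+ subgroup regardless of HR status is likewise an assumption about how the subgroups were formed.

```python
def hormone_receptor_positive(er_percent, pr_percent):
    """ER/PR are positive when >= 1% of cells stain (threshold from the text).

    Assumption: HR+ means ER positive or PR positive.
    """
    return er_percent >= 1.0 or pr_percent >= 1.0


def her2_positive(ihc_score, fish_positive=False):
    """HER2 is positive when IHC is 3+ or FISH is positive, otherwise negative."""
    return ihc_score == 3 or fish_positive


def molecular_subgroup(er_percent, pr_percent, ihc_score, fish_positive=False):
    """Map receptor results to one of the three subgroups named in the paper.

    Assumption: HER2+ takes precedence over HR status.
    """
    if her2_positive(ihc_score, fish_positive):
        return "HER2+"
    if hormone_receptor_positive(er_percent, pr_percent):
        return "HR+/HER2-"
    return "triple-negative"


if __name__ == "__main__":
    # Illustrative values only, not patient data.
    print(molecular_subgroup(er_percent=90, pr_percent=0, ihc_score=1))
    print(molecular_subgroup(er_percent=0, pr_percent=0, ihc_score=2, fish_positive=True))
    print(molecular_subgroup(er_percent=0.5, pr_percent=0, ihc_score=0))
```

Keeping the thresholds in one place makes it easier to check that every record is assigned to a subgroup in the same way.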

US image acquisition
B-mode ultrasound and color Doppler flow images were acquired with a Super Sonic Aixplorer system (Super Sonic Imagine, Aix-en-Provence, France) using a 5-14 MHz linear transducer. The patient was placed in a supine or contralateral-side-down oblique position on the table, with the ipsilateral hand placed behind the head. US scanning typically started from the lower part of the axilla and continued upward toward the axillary fossa. Transverse and sagittal planes were imaged. All images in the two hospitals were obtained by two senior radiologists following the same protocol, in order to reduce the deviation caused by different operators.
Target ALN segmentation and radiomics feature extraction
A total of 150 ALN met the inclusion criteria. Of these, 132 lymph nodes were selected because they corresponded to the ultrasound-guided biopsy images; these were usually the largest ipsilateral lymph node. Biopsies of ALN were performed under ultrasound guidance using an 18G core needle. The remaining 18 lymph nodes, which had no preoperative biopsy, were considered non-metastatic because their postoperative pathological results were lymph node negative, indicating no metastasis in the axilla on that side. For each ALN lesion, the one US image showing its largest diameter was used for analysis. The region of interest (ROI) was manually delineated on the US image using ITK-SNAP 3.8 software (http://www.itksnap.org). Initially, the manual segmentations of all 150 images were performed by Y.-L.T., a breast surgeon who had received breast ultrasound training. To evaluate interobserver reliability, all images were manually re-delineated by T.O-Y., a senior radiologist with 10 years of US experience. Both were blinded to the pathological results. The two-dimensional ROI of the ALN was drawn on the ultrasound image, and the radiomics features were extracted automatically from each image using the open-source Python package PyRadiomics (https://pyradiomics.readthedocs.io/en/latest/) (20). To evaluate intraobserver reliability, Y.-L.T. repeated the ROI segmentation of 50 randomly chosen images in a blinded manner two weeks later. Intraclass correlation coefficients (ICCs) greater than 0.75 were taken to indicate good agreement of ALN segmentation (21,22). The process is presented in Figure 2.
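The reliability check described above can be illustrated with a short ICC computation. This is a sketch under stated assumptions: the text does not say which ICC form was used, so the two-way random-effects, absolute-agreement, single-rater form ICC(2,1) is assumed here, and the names `icc_2_1` and `is_robust` are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one row per image and one column per
    reader (or per segmentation session). Which ICC form the study used
    is not stated, so this choice is an assumption.
    """
    n = len(ratings)        # number of images
    k = len(ratings[0])     # number of readers or sessions
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Mean squares from the two-way ANOVA decomposition.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def is_robust(ratings, threshold=0.75):
    """Keep a feature only when its ICC exceeds the 0.75 cut-off from the text."""
    return icc_2_1(ratings) > threshold


if __name__ == "__main__":
    # Illustrative values of one feature measured on five images by two readers.
    feature_values = [[10.1, 10.3], [12.0, 11.8], [9.5, 9.9], [14.2, 14.0], [11.1, 11.4]]
    print(icc_2_1(feature_values), is_robust(feature_values))
```

In practice the same function would be applied once per feature, to the inter-reader pairs and separately to the repeated segmentations by the same reader.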

Radiomics model construction and evaluation
Differences between the negative and positive ALN in the training and validation cohorts were assessed with the Mann-Whitney test (non-normally distributed data) or the t-test (normally distributed data), with p<0.05 considered significant. We used Spearman's correlation coefficient to evaluate the redundancy of the features and eliminated features with a Spearman correlation coefficient ≥ 0.9, leaving only the most reliable one for further analysis. A supervised learning algorithm was then applied to select the most representative features: least absolute shrinkage and selection operator (LASSO) regression with tenfold cross-validation was used to select the most predictive ALN status-related features from the training set (23). The formula for the US radiomics signature was built from the selected features, and the resulting radiomics signature (RS) was used to predict ALN metastasis in breast cancer. Discrimination ability was evaluated using the area under the receiver operator characteristic (AUROC) curve. The optimal cutoff value was calculated with the Youden index, and its performance was assessed by diagnostic sensitivity, specificity, and accuracy. Furthermore, the performance of the radiomics model was evaluated in subgroups defined by baseline clinicopathological information.
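The cutoff selection described above can be sketched as follows. The Youden index is J = sensitivity + specificity - 1, and the optimal cutoff is the threshold that maximizes J. The function name is hypothetical, and the sketch assumes that a higher signature value indicates metastasis and that a node is called positive when its score is at or above the cutoff.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity, accuracy) maximizing Youden's J.

    `labels` uses 1 for metastatic and 0 for non-metastatic nodes.
    Assumption: a node is predicted positive when score >= cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sensitivity = tp / n_pos
        specificity = tn / n_neg
        j = sensitivity + specificity - 1
        # Keep the first cutoff that reaches the highest J.
        if best is None or j > best[0]:
            accuracy = (tp + tn) / len(labels)
            best = (j, cutoff, sensitivity, specificity, accuracy)
    return best[1:]


if __name__ == "__main__":
    # Illustrative scores only, not study data.
    scores = [0.2, 0.7, 0.5, 0.9, 0.4]
    labels = [0, 0, 1, 1, 1]
    print(youden_cutoff(scores, labels))
```

The cutoff should be derived on the training set only and then applied unchanged to the validation set, so that the validation metrics are not optimistically biased.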

Radiologist evaluation
The US images of ALN were assessed by two radiologists blinded to the pathological results (R1: L.Q. and R2: S.-C.T., with 15 and 30 years of experience, respectively), based on the cortex, morphology, margins, and lymphatic hilum status of the lymph nodes (24). The expert radiologists reviewed the US images and made a binary classification (N0 or NX). The AUROC was calculated for each radiologist, together with the diagnostic sensitivity, specificity, and accuracy. We then evaluated the two radiologists' performance in subgroups.
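Because each radiologist gives a binary call rather than a continuous score, the reader metrics reduce to a 2x2 table. The sketch below is illustrative and its names are hypothetical. It relies on the fact that an ROC curve built from a single binary decision has one operating point, so its area equals (sensitivity + specificity) / 2; whether the study computed the radiologists' AUROC in exactly this way is an assumption.

```python
def binary_reader_metrics(calls, truth):
    """Sensitivity, specificity, accuracy, and AUC for binary reader calls.

    `calls` and `truth` use 1 for NX (metastatic) and 0 for N0.
    For a single binary decision the ROC curve has one operating point,
    so the area under it is (sensitivity + specificity) / 2.
    """
    tp = sum(1 for c, t in zip(calls, truth) if c == 1 and t == 1)
    tn = sum(1 for c, t in zip(calls, truth) if c == 0 and t == 0)
    fp = sum(1 for c, t in zip(calls, truth) if c == 1 and t == 0)
    fn = sum(1 for c, t in zip(calls, truth) if c == 0 and t == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / len(truth),
        "auc": (sensitivity + specificity) / 2,
    }


if __name__ == "__main__":
    # Illustrative calls only, not the radiologists' actual readings.
    calls = [1, 1, 0, 0, 1, 0]
    truth = [1, 1, 1, 0, 0, 0]
    print(binary_reader_metrics(calls, truth))
```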

Statistical analysis
The DeLong test was used to compare AUCs. Statistical analysis was conducted using SPSS 21 software (SPSS Inc., Chicago, IL). All statistical tests were two-sided, and a P value less than 0.05 was considered significant. In univariate analysis, the differences in clinical characteristics between the patients of different groups were compared using the Mann-Whitney U test for continuous variables and the χ2 test for categorical variables. The false discovery rate was calculated using the Benjamini-Hochberg method.
The baseline characteristics of patients and pathological results in the training and validation cohorts are displayed in Table 1. There were no significant differences between these two cohorts in age, breast tumor size, HR status, or HER2 status. Among the total 150 ALN, according to the pathological results, 77 and 33 were positive ALN, and 28 and 12 were negative ALN in the training and validation cohorts, respectively. There was no significant difference in ALN status between the two cohorts. Among the 110 metastatic lymph nodes, 56 were associated with breast tumors larger than 3 cm and 54 with tumors no larger than 3 cm, while among the 40 non-metastatic lymph nodes, 15 were associated with breast tumors larger than 3 cm and 25 with tumors no larger than 3 cm.
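The comparison of ALN status between the two cohorts can be reproduced from the counts given above. The following sketch computes the Pearson χ2 statistic for a 2x2 table; omitting the continuity correction is an assumption, since the text does not say which variant was reported, and the function name is hypothetical. For one degree of freedom the p-value equals erfc(sqrt(χ2 / 2)).

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table

        [[a, b],
         [c, d]]

    Returns (statistic, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


if __name__ == "__main__":
    # Rows: training and validation cohorts; columns: positive and negative ALN.
    print(chi_square_2x2(77, 28, 33, 12))
```

With the reported counts, 77 of 105 training nodes and 33 of 45 validation nodes are positive, which is 73.3% in both cohorts, so the statistic is 0 and the p-value is 1, in line with the statement that ALN status did not differ between cohorts.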

Feature selection and construction of radiomics model
Radiomics features were extracted from each US image, and a total of 651 imaging features were obtained. Of these, 614 features were considered robust (ICC>0.75) and were retained for subsequent analysis. Favorable interobserver and intraobserver reproducibility were achieved with these features, with intraobserver ICCs ranging from 0.750 to 0.999 and interobserver ICCs ranging from 0.752 to 0.999. The 149 features that showed no significant difference between the two groups (N0 group and NX group) were removed. After eliminating redundant features by Spearman correlation analysis, 66 features remained. Finally, nine ALN status-related features were selected by LASSO regression with 10-fold cross-validation. The nine features are represented by the letters A to I; details are shown in Table 2.
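The redundancy-elimination step can be sketched as a greedy filter over pairwise Spearman correlations. This is an illustration under assumptions rather than the procedure actually run: the function names are hypothetical, the absolute value of the correlation is compared with 0.9, "most reliable" is interpreted as the feature with the highest reliability score (for example its ICC), and features are assumed to be non-constant.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)  # assumes non-constant features


def drop_redundant(features, reliability, threshold=0.9):
    """Greedy filter: visit features from most to least reliable and keep a
    feature only if |rho| < threshold against every feature already kept.

    `features` maps name -> list of values; `reliability` maps name -> score
    (assumed here to be the ICC).
    """
    kept = []
    for name in sorted(features, key=lambda f: reliability[f], reverse=True):
        if all(abs(spearman(features[name], features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


if __name__ == "__main__":
    # Illustrative values only; "b" is a monotone copy of "a" and is dropped.
    features = {"a": [1, 2, 3, 4, 5], "b": [2, 4, 6, 8, 11], "c": [5, 1, 4, 2, 3]}
    reliability = {"a": 0.95, "b": 0.80, "c": 0.90}
    print(drop_redundant(features, reliability))
```

A greedy pass of this kind depends on the visiting order, which is why the ordering criterion is stated explicitly as an assumption.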

Model validation
As shown in Table 3, there was a statistically significant difference in radiomics signature between N0 and NX ALN in the training group (p<0.001) and the validation group (p<0.001). As shown in Figure 3, the radiomics signature achieved an AUC of 0.929 (95% CI, 0.881-0.978) and an AUC of 0.919 (95% CI, 0.841-0.997) in the training and validation cohorts, respectively. The ROC curves of the two radiologists (R1 and R2) were drawn for comparison. R1 achieved an AUC of 0.782 (95% CI, 0.692-0.873) and 0.682 (95% CI, 0.521-0.842) in the training and validation groups, respectively, and R2 achieved an AUC of 0.833 (95% CI, 0.750-0.916) and 0.738 (95% CI, 0.589-0.888) in the training and validation groups, respectively. Based on the Youden index, the threshold of the total points to predict ALN status was determined to be 0.902. As shown in Table 4, the radiomics model achieved an accuracy, sensitivity, [...]. A test for two correlated ROC curves was conducted, and the radiomics model performed better than the two experts' prediction of ALN status in both cohorts (P<0.05).
Diagram shows workflow of modeling for ALN status prediction in patients with breast cancer.
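The AUC values reported above summarize how well the signature ranks metastatic nodes above non-metastatic ones. A minimal sketch of that calculation is given here, using the equivalence between the AUC and the Mann-Whitney statistic: the AUC is the fraction of positive/negative pairs in which the positive node has the higher score, with ties counted as one half. The function names are hypothetical, and the direction of the decision rule (signature at or above 0.902 means NX) is an assumption.

```python
def auc_from_scores(scores, labels):
    """AUC as the Mann-Whitney probability that a positive case outranks
    a negative case (ties count as one half).

    `labels` uses 1 for NX (metastatic) and 0 for N0.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def predict_status(signature, cutoff=0.902):
    """Apply the reported threshold.

    Assumption: a signature at or above the cutoff is read as NX.
    """
    return "NX" if signature >= cutoff else "N0"


if __name__ == "__main__":
    # Illustrative scores only, not study data.
    scores = [0.2, 0.7, 0.5, 0.9, 0.4]
    labels = [0, 0, 1, 1, 1]
    print(auc_from_scores(scores, labels))
    print(predict_status(0.95), predict_status(0.50))
```

This pairwise form is quadratic in the number of nodes, which is unproblematic for a cohort of this size; confidence intervals and the DeLong comparison would require additional steps that are not shown.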

Discussion
In this study, we constructed and validated a model based on features derived from US images of ALN for the prediction of ALN status in breast cancer patients. This method is convenient and easy to conduct, which might help in making precise decisions for each patient.
Ultrasound is a common method for evaluating lymph node involvement in breast cancer patients. Its reported sensitivity ranges from 49% to 87%, and its specificity from 55% to 97% (25). In this study, both shape and intensity features were extracted from images, and the feature selection method removed all shape features. Radiologists mainly rely on the shape of lesions, whereas our selected features are based on intensity, which might be overlooked in daily clinical practice. The radiologists from two different hospitals performed less well than our model, which indicates that our radiomics signature outperformed the routine US-guided ALN examination.
Previous studies indicated that the same model could perform differently across molecular subtypes in patients with breast cancer. M L G Vane et al. found a significant difference in negative predictive value (NPV) between triple-negative tumors and HER2+ tumors and between HER2+ and ER/PR+HER2- tumors in the axillary US examination (26). Jie Fei et al. found that ultrasound had its lowest positive predictive value for ALN status (73.2%) in the triple-negative subtype (27). Our model achieved good performance in the HR+/HER2-, HER2+, and triple-negative subgroups.
There is no uniform standard among studies as to which clinicopathological factors might serve as independent risk factors for the prediction of ALN status in breast cancer. M P Budzik et al. found that the hormone receptor status and HER2 expression seemed to be related to the regional lymph node involvement (pN0-pN4) of malignant tumors (28). Illyes et al. found that primary tumors larger than 20 mm were significantly associated with a higher incidence of SLN metastasis (p<0.001), while primary tumors larger than 26 mm were associated with additional positive non-SLN (p>0.001) (29). In our study, no clinicopathological indicators were used to build a prediction model. In univariate analysis, tumor size greater than 30 mm was associated with SLN metastasis in the training group, but the difference was not significant in the multivariate analysis, which is consistent with Nicla La Verde's study (30).
Although many studies have used US parameters or image features of breast lesions to predict ALN status (31), most developed prediction models using radiomics based on images of breast lesions, and few concentrated on imaging features of ALN (32-34). Our prediction model achieved good diagnostic performance using the radiomics signature derived from the US image of lymph nodes, which could be considered as an evaluation indicator when surgeons make plans specific to a patient's situation.
Our study has some limitations. Firstly, it is a retrospective study that collected data from only two hospitals, and only a small number of patients and lymph nodes were included. Secondly, for three patients, the characteristics of bilateral lymph nodes in the same patient were considered independent. We believe that a patient can have both normal and metastatic lymph nodes and that tumor heterogeneity can occur within one patient. Still, this may have introduced some bias. In addition, our research focused on qualitative analysis based on US images of ALN, not on quantitative analysis of ALN metastasis burden in breast cancer. Therefore, multicenter studies incorporating more patients should be considered in future research, and we should strive to advance from qualitative to quantitative research.
In summary, we built a model based on ALN images to predict ALN status in breast cancer, which might provide vital information for precise diagnosis and treatment based on an ordinary examination. It should also be noted that a higher level of evidence is required before any breast surgery recommendation could be based entirely on it.

FIGURE 1 Flow chart of the study population. US, ultrasound; ALN, axillary lymph nodes.


TABLE 2
Radiomic features selection result.

TABLE 1
Clinical characteristics of patients in the training and validation cohorts.
Data expressed as n (%), unless otherwise stated. US, ultrasound; ALN, axillary lymph node; RS, radiomics signature; FDR, false discovery rate. ¶ By the Mann-Whitney U test. § By the Chi-square test.

TABLE 3
Clinical characteristics in training and validation cohort between N0 and NX.
Data expressed as n (%), unless otherwise stated. US, ultrasound; RS, radiomics signature; FDR, false discovery rate. ¶ By the Mann-Whitney U test. § By the χ2 test.

TABLE 4
Performance of the radiomics model and two radiologists for predicting ALN status in training and validation groups.

TABLE 5
Performance of the radiomics model and two radiologists for predicting ALN status in subgroups.